Automated ontology construction for unstructured text documents
Authors
Abstract
Ontologies play an increasingly important role in knowledge management and the Semantic Web. This study presents a novel episode-based ontology construction mechanism that extracts a domain ontology from unstructured text documents. Fuzzy numbers for computing conceptual similarity are presented for concept clustering and for defining taxonomic relations. Concept attributes and operations can be extracted from episodes to construct the domain ontology, and non-taxonomic relations can likewise be generated from episodes. A fuzzy inference mechanism is also applied to obtain new instances for ontology learning. Experimental results show that the proposed approach can effectively construct a Chinese domain ontology from unstructured text documents. © 2006 Elsevier B.V. All rights reserved.
Similar resources
Document clustering based on ontology and a fuzzy approach
Data mining, also known as knowledge discovery in databases, is the process of discovering unknown knowledge from a large amount of data. Text mining applies data mining techniques to extract knowledge from unstructured text. Text clustering, one of the important techniques of text mining, is the unsupervised classification of similar documents into different groups. The most important step...
Ontology Construction for Structured Textual Data
This paper explores the domain of automatically extracting semantic information from a given database schema and structured textual data for the purpose of ontology construction. We examine the advantages of ontology-based information search mechanisms and survey the different techniques currently employed for automated ontology construction from a given text corpus and from unstructured multimedi...
Systematic text-mining approach for deriving aspects and patterns from domain knowledge
As the theoretical underpinnings of aspect-orientation mature, its application across the software lifecycle has expanded. An active area of research focuses on the application of aspect-oriented techniques to unstructured or semi-structured requirements documents. In this context, primary issues involve the identification of early aspects and various forms of aspectual manipulation (e.g., weav...
LinkedMDR: un modèle sémantique de représentation de corpus de documents multimédia
Projects in the construction industry involve the exchange of a large amount of information between several actors with different expertise and interests. Most of this information is unstructured, originates from different sources, and is dispersed across heterogeneous documents, thus producing implicit and explicit dependencies between them. This becomes very critical as it makes the annotatio...
Ontology Learning and Its Application to Automated Terminology Translation
a definition and some background.) Many in the computational-linguistics research community use WordNet,3 but large-scale IT applications based on it require heavy customization. Thus, a critical issue is ontology construction— identifying, defining, and entering concept definitions. In large, complex application domains, this task can be lengthy, costly, and controversial, because people can h...
Journal title:
- Data Knowl. Eng.
Volume 60, Issue
Pages -
Publication date: 2007